* Quick Note: The excel file used for the notes has a lot of NA values, don't worry, that's just the way the file was,
R has the ability to read and write to excel, which makes it very convenient to work on the same datasets as business analysts or colleagues who only know excel, meaning they can work with excel and hand you the files, then you work with them in R!
To do this, we need the readxl package for R. Remember you can download it by using:
install.packages('readxl')
You may need to specify repos="http://cran.rstudio.com/" as an argument in the packages call if you get a mirror error.
Let's see how we can use this:
# In case you don't have readxl (you may not need to specify repos)
#install.packages('readxl',repos="http://cran.rstudio.com/")
# Load the readxl package
library(readxl)
# list the sheets of the excel file
excel_sheets('Sample-Sales-Data.xlsx')
# Call info from the sheets using read_excel
df <- read_excel('Sample-Sales-Data.xlsx',sheet='Sheet1')
head(df)
Now you can perform operations on it just like a normal R data frame:
sum(df['Value'])
str(df)
summary(df)
If you had multiple sheets that you wanted to import into a list, you could do this with lapply():
entire_workbook <- lapply(excel_sheets("Sample-Sales-Data.xlsx"),
read_excel,
path = "Sample-Sales-Data.xlsx")
# Show entire list:
#entire_workbook
Writing to excel requires the xlsx package:
install.packages('xlsx',repos="http://cran.rstudio.com/")
library(xlsx)
df <- data.frame(matrix(1:10))
write.xlsx(df, "output.xlsx")
read_excel('output.xlsx')
That's it for the basics of reading and writing with excel files, depending on your excel files, you may need to use help() to find out additional arguments you may need to get exactly what you want. Because students may or may not use excel, we will stick to using csv files for the course!